Marginalized mixture models for count data from multiple source populations
Authors
Abstract
Mixture distributions provide flexibility in modeling data collected from populations with unexplained heterogeneity. While regression parameters from traditional finite mixture models have interpretations specific to unobserved subpopulations or latent classes, investigators are often interested in making inferences about the marginal mean of a count variable in the overall population. Recently, marginal mean regression modeling procedures for zero-inflated count outcomes have been introduced within the framework of maximum likelihood estimation of zero-inflated Poisson and negative binomial regression models. In this article, we propose marginalized mixture regression models based on two-component mixtures of non-degenerate count data distributions that provide directly interpretable estimates of exposure effects on the overall population mean of a count outcome. The models are examined using simulations and applied to two datasets, one from a double-blind dental caries incidence trial and the other from a horticultural experiment. The finite sample performance of the proposed models is compared with each other and with marginalized zero-inflated count models, as well as with ordinary Poisson and negative binomial regression.
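To make the idea of a marginal mean in a two-component mixture concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a two-component Poisson mixture with mixing probability `pi`, first-component mean `mu1`, and a log-linear model for the overall mean `nu`; the second-component mean is then implied by the identity nu = pi * mu1 + (1 - pi) * mu2. All function names and numeric values are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability mass at count y for mean mu > 0."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def mixture_pmf(y, pi, mu1, mu2):
    """Two-component Poisson mixture with weight pi on the first component."""
    return pi * poisson_pmf(y, mu1) + (1 - pi) * poisson_pmf(y, mu2)


def marginal_mean(x, beta):
    """Assumed log-linear model for the overall mean: nu = exp(x' beta)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def second_component_mean(nu, pi, mu1):
    """Solve nu = pi * mu1 + (1 - pi) * mu2 for mu2."""
    mu2 = (nu - pi * mu1) / (1 - pi)
    if mu2 <= 0:
        raise ValueError("implied second-component mean must be positive")
    return mu2


# Hypothetical coefficients: intercept and a binary exposure effect.
beta = (math.log(2.0), math.log(1.5))
pi, mu1 = 0.4, 0.5  # hypothetical mixing probability and first-component mean

nu_unexposed = marginal_mean((1, 0), beta)
nu_exposed = marginal_mean((1, 1), beta)
mu2_unexposed = second_component_mean(nu_unexposed, pi, mu1)

# Numerical check: the mixture's mean equals the modeled marginal mean.
mixture_mean = sum(y * mixture_pmf(y, pi, mu1, mu2_unexposed) for y in range(200))

# Under this parameterization exp(beta[1]) is the ratio of overall means.
mean_ratio = nu_exposed / nu_unexposed
```

Under this assumed parameterization, the exposure coefficient acts directly on the overall population mean, which is the kind of interpretation the abstract describes, whereas coefficients on `mu1` or `mu2` alone would describe only a latent class.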
Similar resources
Analysis of zero-inflated clustered count data: A marginalized model approach
Min and Agresti (2005) proposed random effect hurdle models for zero-inflated clustered count data with two-part random effects for a binary component and a truncated count component. In this paper, we propose new marginalized models for zero-inflated clustered count data using random effects. The marginalized models are similar to Dobbie and Welsh’s (2001) model in which generalized estimating...
Application of finite mixture models for vehicle crash data analysis.
Developing sound or reliable statistical models for analyzing motor vehicle crashes is very important in highway safety studies. However, a significant difficulty associated with the model development is related to the fact that crash data often exhibit over-dispersion. Sources of dispersion can be varied and are usually unknown to the transportation analysts. These sources could potentially af...
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
Collapsed Variational Dirichlet Process Mixture Models
Nonparametric Bayesian mixture models, in particular Dirichlet process (DP) mixture models, have shown great promise for density estimation and data clustering. Given the size of today’s datasets, computational efficiency becomes an essential ingredient in the applicability of these techniques to real world data. We study and experimentally compare a number of variational Bayesian (VB) approxim...
Zero-inflated Poisson regression mixture model
Excess zeros and overdispersion are commonly encountered phenomena that limit the use of traditional Poisson regression models for modeling count data. The focus of this paper is on modeling count data in the case that a population has excess zero counts and also consists of several sub-populations in the non-zero counts. The proposed zero-inflated Poisson regression mixture model accounts for ...